generator program
CodeContests+: High-Quality Test Case Generation for Competitive Programming
Wang, Zihan, Liu, Siyao, Sun, Yang, Li, Hongyan, Shen, Kai
Competitive programming, due to its high reasoning difficulty and precise correctness feedback, has become a key task for both training and evaluating the reasoning capabilities of large language models (LLMs). However, while a large amount of public problem data, such as problem statements and solutions, is available, the test cases of these problems are often difficult to obtain. Therefore, test case generation is a necessary task for building large-scale datasets, and the quality of the test cases directly determines the accuracy of the evaluation. In this paper, we introduce an LLM-based agent system that creates high-quality test cases for competitive programming problems. We apply this system to the CodeContests dataset and propose a new version with improved test cases, named CodeContests+. We evaluated the quality of test cases in CodeContestsPlus. First, we used 1.72 million submissions with pass/fail labels to examine the accuracy of these test cases in evaluation. The results indicated that CodeContests+ achieves significantly higher accuracy than CodeContests, particularly with a notably higher True Positive Rate (TPR). Subsequently, our experiments in LLM Reinforcement Learning (RL) further confirmed that improvements in test case quality yield considerable advantages for RL.
- South America > Suriname > North Atlantic Ocean (0.14)
- South America > Peru > Ucayali Department (0.04)
- South America > Peru > Madre de Dios Department (0.04)
- (4 more...)
Solving Epistemic Logic Programs using Generate-and-Test with Propagation
This paper introduces a general framework for generate-and-test-based solvers for epistemic logic programs that can be instantiated with different generator and tester programs, and we prove sufficient conditions on those programs for the correctness of the solvers built using this framework. It also introduces a new generator program that incorporates the propagation of epistemic consequences and shows that this can exponentially reduce the number of candidates that need to be tested while only incurring a linear overhead. We implement a new solver based on these theoretical findings and experimentally show that it outperforms existing solvers by achieving a ~3.3x speed-up and solving 91% more instances on well-known benchmarks.
- North America > United States > Nebraska > Douglas County > Omaha (0.14)
- North America > United States > Vermont > Chittenden County > Burlington (0.04)
Generating Pusheen with AI
The BEGAN model uses a loss equation based on the Wasserstein distance, except its goal is to minimize the absolute value of the autoencoder losses on the real and fake images instead of the images themselves. In practice it drops the absolute value and minimizes the reconstruction loss on real images minus the reconstruction loss on fake images. Additionally, it introduces a weighting term on the fake reconstruction loss which changes proportaional to the difference between the fake and real reconstruction losses; this serves to maintain a balance between the discriminator and generator so one does not easily win over the other. I'll be releasing the code soon but to summarize, the architectures and hyperparameters that typically worked well with for the discriminator and generator were: I did a fair amount of hyperparameter searching to get a model that worked (keep scrolling for some failures) but I think it ended up being a pretty decent cat generator. Below is a training video of one of the models that I use in the demos, where every 250 steps I take 16 samples from the generator.